EPJ manuscript No. 

(will be inserted by the editor) 




Dependence of the average to-node distance on the node degree
for random graphs and growing networks

K. Malarz and K. Kułakowski

Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, PL-30059
Kraków, Poland, e-mail: malarz@agh.edu.pl, kulakowski@novell.ftj.agh.edu.pl

Received: date / Revised version: date 

Abstract. In a connected graph, nodes can be characterised locally (with their degree k) or globally (e.g.
with their average path length ξ to other nodes). Here we investigate how ξ depends on k. A numerical
algorithm based on the construction of the distance matrix is applied to random graphs and to growing
networks, both scale-free and exponential. The results are relevant for search strategies in
different networks.
1 Introduction

Recent interest in analytical and numerical research of growing networks [1,2,3] was initiated by a
seminal paper of Barabási and Albert [4]. The authors demonstrated that a natural growth algorithm
produces a scale-free power-law distribution of the node degree, i.e. of the number of edges of a node.
Moreover, this power law has been found to appear in several existing networks, such as the actor
collaboration network, the WWW, and the power grid network [4]. Since then, many other examples of this
universal pattern have been discovered, and the list still seems open. The idea of a growing network
emerges as a new paradigm of interdisciplinary importance.

The growing process is understood as a successive adding of new nodes, each linked to older ones by m
edges. When
m = 1, a so-called tree appears. A tree is a connected graph
without cyclic paths. For m ≥ 2, when a newly attached
node is linked to more than one node, cyclic paths are
possible and the resulting structure is termed a simple
graph [6,7]. While the network (a tree or a simple
graph) grows, the existing nodes to which the new ones are
linked can be selected preferentially, i.e. with a probability
proportional to their degree. In this case, the degree
distribution is given by a power law, i.e. P(k) ∝ k^{-γ}. If
the nodes are selected randomly, the degree distribution
is exponential, i.e. P(k) = 2^{-k} for trees.
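
As an illustration only, the growth rule described above can be sketched in Python. The names grow_network and degree_distribution, the adjacency-set representation and the list of edge ends used for the preferential draw are assumptions of this sketch, not details of the simulations reported here. The sketch also assumes that the m targets of a new node are distinct and that growth starts from two linked nodes.

```python
import random
from collections import Counter


def grow_network(n_nodes, m=1, preferential=True, seed=None):
    """Grow a network node by node; return {node: set of neighbours}.

    Hypothetical helper: each new node is linked to m distinct existing
    nodes, chosen with probability proportional to their degree
    (preferential=True) or uniformly at random (preferential=False).
    """
    rng = random.Random(seed)
    neighbours = {0: {1}, 1: {0}}  # two nodes linked together
    # Each node appears in `stubs` once per edge end, so a uniform draw
    # from `stubs` is a draw with probability proportional to degree.
    stubs = [0, 1]
    for new in range(2, n_nodes):
        want = min(m, new)  # cannot link to more nodes than exist
        targets = set()
        while len(targets) < want:
            if preferential:
                targets.add(rng.choice(stubs))
            else:
                targets.add(rng.randrange(new))
        neighbours[new] = set()
        for old in targets:
            neighbours[new].add(old)
            neighbours[old].add(new)
            stubs.extend((new, old))
    return neighbours


def degree_distribution(neighbours):
    """Return {k: fraction of nodes with degree k}."""
    counts = Counter(len(nbrs) for nbrs in neighbours.values())
    total = len(neighbours)
    return {k: c / total for k, c in sorted(counts.items())}
```

Calling grow_network(1000, m=1, preferential=False) gives an exponential tree in the sense used here, and preferential=True gives a scale-free one; degree_distribution can then be compared with the two forms of P(k) quoted above.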

One of the striking features of many growing networks
is the small-world effect [8]. Namely, in such networks the
mean distance d between nodes increases with the number
N of nodes only as ln(N) or slower. For example, the actor
collaboration network is formed of 449 thousand nodes;
two actors are linked if they played roles in the
same movie. The mean distance d, i.e. the mean number
of links between actors, is less than 3.5.

Actually, the small-world effect in human relations was
discovered more than 35 years ago in a brilliant
sociometric experiment [9,10]. A group of individuals was
asked to send a letter to a target person in Boston via an
acquaintance who was supposed to be closer to the target
than the sender. The mean length of the letter chain
was less than seven. This experiment was repeated several
times, and it is currently being continued at Columbia
University. Recently, such considerations
inspired a hierarchical model of a social network [13], where
a contact between different groups within a given
hierarchy is possible only via a person who is higher in the
hierarchy.

In this kind of contact experiment, finding an appropriate
next person in the path is a nontrivial task, and
several strategies are possible [14,15]. One of the most obvious
is to find the most connected person, i.e. the neighbouring
node with the highest degree. This strategy has been
shown to be effective in networks with a power-law degree
distribution, but not in random graphs [14]. We note here
that any such strategy must be abandoned once the desired target
is within a reasonably short distance. The discussion below is
conducted with this condition in mind.
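
A minimal sketch of such a most-connected-neighbour walk is given below, for illustration only. The function name mcns_search, the rule of never revisiting a node, the tie-breaking by smallest node label and the step limit are assumptions of this sketch. The stopping rule (step directly to the target once it is a neighbour) is one concrete reading of the condition stated above.

```python
def mcns_search(neighbours, start, target, max_steps=None):
    """Greedy walk from start towards target on {node: set of neighbours}.

    At each step: if the target is a neighbour, step to it and stop;
    otherwise move to the unvisited neighbour with the highest degree.
    Returns the list of visited nodes, or None if the walk gets stuck
    or exceeds max_steps (hypothetical safety limit).
    """
    if max_steps is None:
        max_steps = len(neighbours)
    path = [start]
    visited = {start}
    current = start
    while current != target:
        if len(path) > max_steps:
            return None
        if target in neighbours[current]:
            path.append(target)
            return path
        candidates = sorted(v for v in neighbours[current] if v not in visited)
        if not candidates:
            return None  # dead end: every neighbour was already visited
        # max() keeps the first maximal element, i.e. the smallest label on ties
        current = max(candidates, key=lambda v: len(neighbours[v]))
        visited.add(current)
        path.append(current)
    return path
```

The walk is purely local: it uses only the degrees of the neighbours of the current node, which is why its success depends on how strongly the distance to other nodes falls with the degree.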

In this paper, we ask whether this strategy
is effective in the exponential networks. Our numerical
method, however, differs from the approach applied in
Refs. [14,15]. Here we construct the distance matrix for a
given network. For each node i of degree k, we
calculate the mean distance ξ_i to all other nodes in the
network. For a given kind of network (say, scale-free networks)
we calculate the average of ξ_i over all nodes with
given degree k. In this way we get a curve ξ(k). The
average distance d can be obtained by averaging ξ(k) over
k with the weights P(k). One expects ξ to decrease with k because, on
average, the paths from more connected nodes are shorter
than the paths from a node with only one or two edges. If
this decrease is sharp, the strategy of the most connected
neighbour (MCNS) is effective, because the path from the
selected neighbour to other nodes is shorter on average.
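
The curve ξ(k) can be illustrated with the following sketch. It uses a breadth-first search from every node rather than the distance matrix on which this paper relies; for a connected graph both give the same distances. The function names and the normalisation by the number of other reachable nodes are assumptions of this sketch.

```python
from collections import defaultdict, deque


def mean_distance_from(neighbours, source):
    """Mean shortest-path length from source to all other reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = len(dist) - 1
    return sum(dist.values()) / others if others else 0.0


def xi_of_k(neighbours):
    """Average the per-node mean distance over nodes of equal degree.

    Returns {k: xi(k)} for a graph given as {node: set of neighbours}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, nbrs in neighbours.items():
        k = len(nbrs)
        sums[k] += mean_distance_from(neighbours, node)
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

For the three-node chain 0-1-2, the end nodes (k = 1) have mean distance 1.5 and the middle node (k = 2) has mean distance 1, so ξ already decreases with k in this smallest example.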

It is worth mentioning here that ξ is a direct measure of the so-called closeness centrality (CC) of a
given node. A node with high CC is obviously in a good position to reach other nodes along short paths.
The MCNS strategy (termed MAX in Ref. [15]) is simply to increase the node degree. The slope of the curve
ξ(k) then tells us how effective this strategy is for a given network. The effectiveness of MCNS for
nodes of given k can be evaluated by an index η(k) obtained from this slope.

In principle, the total effectiveness for a given kind of network should be calculated as an average
over all nodes. In such an average, the majority of nodes have a low degree, so what is relevant is the
value of η for low k. Instead of averaging, we show that the curve η(k) carries all the important
information.

In the next section, we describe our method of simulation. Later on we show the results for the
scale-free networks, the exponential networks and connected Erdős-Rényi random graphs (CRG) [17,18].
The section is closed by a discussion.

A standard way of calculating distances between two nodes is the breadth-first search algorithm. Our
numerical approach is based on the construction of the distance matrix S, an element of which, s(i,j),
is the length of the shortest path between nodes i and j. The matrix S is formed simultaneously with
the network growth [21,22,23].

For the exponential networks, the nodes to which new nodes are attached are selected randomly. For the
scale-free networks, these nodes are selected preferentially, i.e. with probabilities proportional to
their degree. For the growing networks, the starting point of the simulation is the matrix

    S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}

representing only two nodes linked together. The subsequent stages of the construction of the matrix S
for growing trees (m = 1) and growing simple graphs (m = 2) were described in Refs. [21,22,23]. Here,
we present a similar algorithm for the construction of the distance matrix S for Erdős-Rényi CRG
[17,18].

We start the simulation with an N × N matrix with all non-diagonal elements equal to N, which is larger
than the largest possible distance between any of N connected nodes. Then, following the definition of
CRG, we try to link each node pair randomly with a given probability p. Strictly speaking, we go through
all non-diagonal elements of S and set s(i, j < i) equal to one with the probability p. Obviously, the
matrix S is kept symmetric. Each time a new edge is added, we have to rebuild the whole matrix S to
account for the link between nodes i and j:

    \forall\, 1 \le m, n \le N: \quad s(m,n) = \min\bigl( s(m,n),\; s(m,i) + 1 + s(j,n),\; s(m,j) + 1 + s(i,n) \bigr).

After such a procedure the matrix S_{N×N} contains elements equal to N only if the graph is not
connected. One could ask if the order of updating the matrix elements could change the final result.
Our answer is no, and
the argument is as follows. Adding an edge, say (m,n), we have to check for each pair (i,j) the minimum
of the following: s(i,j) before adding the new edge (m,n), which does not contain this edge by
definition; s(i,m) + 1 + s(n,j); and s(i,n) + 1 + s(m,j). The new edge (m,n) is represented above as a
unit. No other part s of the path selected as the minimal one contains the edge (m,n); if the path did
contain it, it would contain it twice and therefore would not be minimal. In other words, there are two
possibilities: either the minimal path does not contain the new edge (m,n), or it contains it once.
Then, all paths s used in the update rule and selected as minimal do not contain the new edge, so they
are not changed by adding it. Therefore the order of updating these parts is not relevant.

For a given matrix S we calculate the distribution of node degree P(k) and the average distance ξ(k) to
a node of given k. Note that the number of elements equal to 1 in the i-th row/column of the matrix S
gives the degree of the i-th node. On the other hand, the mean Σ_j s(i,j)/N of the matrix elements in
the i-th row/column is the average distance ξ_i to that node.

The results are averaged over N_run independent networks, i.e. various matrices S_{N×N}.

3 Results and discussion

For the scale-free networks we reproduce P(k) ∝ k^{-γ} with γ ≈ 2.7, while the theoretical value is
3.0. The numerical reduction of γ is known to be caused by the finite-size effect [1,24]. For the
exponential trees the node degree distribution is verified to be P(k) = 2^{-k}. The degree distribution
for CRG follows the Poisson distribution

    P(k) = \exp(-\langle k \rangle)\, \langle k \rangle^{k} / k!

with ⟨k⟩ ≈ 20 and ⟨k⟩ ≈ 50 for p = 0.02 and p = 0.05, respectively.

[Figure 1: plot for N = 10^3, with N_run = 10^7 (m = 1), N_run = 10^3 (m = 2) and N_run = 100 (CRG).
Curves: scale-free trees (m = 1) and graphs (m = 2), exponential trees (m = 1) and graphs (m = 2), CRG
(p = 0.02, p = 0.05, and p = 0.4 with N_run = 1), and scale-free and exponential trees with N = 10^4,
N_run = 10.]

Fig. 1. The average distance ξ(k) to a node with given degree k for different networks.

In Fig. 1 the average distance ξ to a node with a given degree k is presented. Each network contains at
least a thousand nodes. In Fig. 2 the dependence η(k) is shown. The results are averaged over
N_run = 10^7, 10^3 and 100 different networks for trees, simple graphs and CRG, respectively.

As explained at the end of the Introduction, the most relevant part of the curves η(k) is the left one,
where the degree is small. The results for large k reflect the fact that once the nodes of highest
possible degree are reached, a further search may not be efficient. (Note that in the simulation
performed in Ref. [15], the search was stopped once the distance from the desired node was one.)

For larger networks, the whole plots presented in Figs. 1 and 2 are expected to be stretched toward
larger values of k. However, this stretching is logarithmically slow.
[Figure 2: panel (a) N = 1000; panel (b) N = 1000, N_run = 10^7; horizontal axis k, logarithmic, from 1
to 1000.]

Fig. 2. The dependence η(k) for (a) the exponential networks and CRG (p = 0.02, 0.05 and 0.4) and (b)
the exponential trees and the scale-free trees. The numerical uncertainties are smaller than the symbol
size, except for the last two bins for trees, for which the uncertainties are huge and not shown.

In Fig. 1 the results for trees with N = 10^3 and 10^4 are compared. The main difference is just the
shift of the curve upward when N increases. The slope of the curve, η(k), is therefore roughly the
same. The results show that the investigated strategy (MCNS) is most effective for the exponential
trees, where m = 1. There, the value of the index η is the largest. This is true in particular around
k = 10, where η has a maximum. The existence of this maximum does not depend on the size N of the
investigated networks. However, for the exponential networks with m = 2, the obtained values of ξ
depend much more weakly on the degree k. There, the obtained values of η are comparable to those of the
scale-free trees. Here again, the size of the network does not influence the results, but the increase
of the number of links m from one to two leads to a further decrease of the index η. Finally, for the
random graphs the mean distance ξ practically does not depend on k, and the index η is close to zero.
These conclusions on the scale-free networks and on the random graphs agree with the earlier results
quoted in the Introduction, but MCNS applied in an exponential tree is even more effective than in the
scale-free tree.

The explanation of the result is as follows. In the scale-free networks, local fluctuations of the
degree are enhanced by subsequent linkings. In this way, the structure becomes heterogeneous: multiple
centres of high degree can be created, and the growth concentrates around these centres. This
hierarchical structure of the scale-free networks was described recently in Ref. [25]. Then, MCNS can
be misleading, as it always leads to a local centre; however, sometimes the target is somewhere else.
This enhancement is absent in the exponential networks, and that is why MCNS works better there. We
note that this argumentation works well for trees. For other systems, there is more than one path
between each pair of nodes, and any educated but general strategy cannot replace the knowledge of where
the target is.

Our new tool, the index η defined above, seems to be useful for comparing different kinds of networks.
In
a purely geometrical sense, it gives the following information:
if a node has more edges, how much closer is it to
the network centre, where the mean distance ξ is minimal?
From this point of view, the structure of a given network
can be found to be more or less resistant to damage and/or
penetration. This problem is of potential relevance for
numerous applications, e.g. in computer science, sociophysics
and immunology [26,27,28].

To our knowledge, the only example of the exponential
network is the electrical power grid in the western US
[29]. However, we know examples where the preference of
linking is inverted: new nodes are more likely to be linked to than
old ones. Such is the case of diffusion-limited aggregation,
known as DLA, which leads to the formation of fractal-like
dendritic molecules [30]. If such an anti-preference is
possible, it stands to reason that some networks also exist where
the preference is absent, or at least small. These latter
networks should be close to the exponential ones. For example,
if a network of actresses is investigated,
the preference for old nodes could be weaker.

In conclusion, we have formulated a quantitative criterion
for evaluating the search strategy of linking to the most
connected neighbour. We demonstrated that this strategy
is more efficient for the exponential trees than for the
scale-free and random networks.